The GPS data set is available as a CSV file here. Go here if you would rather copy and paste the contents of this CSV file to your computer.

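This file contains 1841 coordinates from 1839 unique cases (i.e. 22 % of the 8288 cases reported in PACS). The breakdown by year is as follows, where n is the number of reported cases, gps the number of those cases with GPS coordinates, and perc the corresponding percentage: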

# A tibble: 8 x 4
   year     n   gps  perc
  <dbl> <int> <int> <dbl>
1  2012   511    93 18.2 
2  2013  2216   595 26.9 
3  2014   130     5  3.85
4  2015   570   122 21.4 
5  2016  1291   280 21.7 
6  2017  2170   603 27.8 
7  2018   835    60  7.19
8    NA   565    81 14.3 
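
As a quick consistency check, the percentages and the headline figures can be recomputed from the counts in the table above. This is a minimal sketch in base R. The data frame name `by_year` and the `total_*` variables are chosen here for illustration and are not taken from the original analysis.

```r
# Counts copied from the table above; the last row holds the cases with no year.
by_year <- data.frame(
  year = c(2012:2018, NA),
  n    = c(511, 2216, 130, 570, 1291, 2170, 835, 565),
  gps  = c(93, 595, 5, 122, 280, 603, 60, 81)
)

# Percentage of cases with GPS coordinates, per year.
by_year$perc <- 100 * by_year$gps / by_year$n

# Totals, to compare with the figures quoted in the text
# (1839 geocoded cases out of 8288 reported cases).
total_gps  <- sum(by_year$gps)
total_n    <- sum(by_year$n)
total_perc <- 100 * total_gps / total_n

print(by_year)
```

The yearly counts add up to the totals quoted in the text, so the table and the summary sentence are consistent with each other.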

Restricted to confirmed cases (i.e. those with a positive test), the breakdown is:

# A tibble: 8 x 4
   year     n   gps  perc
  <dbl> <int> <int> <dbl>
1  2012   203    93  45.8
2  2013  1248   573  45.9
3  2014    16     5  31.2
4  2015   267   122  45.7
5  2016   250   137  54.8
6  2017  1323   586  44.3
7  2018   317    60  18.9
8    NA   134    24  17.9

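Two cases each have two sets of coordinates, which is why there are 1841 coordinates for 1839 unique cases. These duplicates are: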

# A tibble: 4 x 4
     id source   longitude latitude
  <int> <chr>        <dbl>    <dbl>
1  7681 server        103.     18.0
2  7681 server        103.     18.1
3  6060 whatsapp      103.     18.1
4  6060 whatsapp      103.     18.0
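
One way to list such duplicates is to keep every row whose id occurs more than once. The sketch below uses base R on a hypothetical `coords` data frame. Only the two duplicated ids and their sources come from the output above. The other two rows and all coordinate values are made up for illustration.

```r
# Hypothetical coordinates table. Ids 7681 and 6060 appear twice, as in the
# output above; ids 1 and 2 and every longitude/latitude value are invented.
coords <- data.frame(
  id        = c(7681, 7681, 6060, 6060, 1, 2),
  source    = c("server", "server", "whatsapp", "whatsapp", "new_gps", "old_gps"),
  longitude = c(102.61, 102.63, 102.60, 102.62, 102.59, 102.64),
  latitude  = c(17.97, 18.05, 18.06, 17.98, 17.96, 18.01)
)

# Ids that occur more than once, then all rows belonging to those ids.
dup_ids    <- unique(coords$id[duplicated(coords$id)])
duplicates <- coords[coords$id %in% dup_ids, ]

print(duplicates)
```

Keeping all rows of a duplicated id, rather than only the second occurrence, makes it possible to compare the conflicting coordinates side by side.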

There are 81 geocoded cases for which we don’t have any date:

# A tibble: 81 x 5
      id onset      hospitalization consultation sample_collection
   <int> <date>     <date>          <date>       <date>           
 1   937 NA         NA              NA           NA               
 2   941 NA         NA              NA           NA               
 3  1112 NA         NA              NA           NA               
 4  1211 NA         NA              NA           NA               
 5  1281 NA         NA              NA           NA               
 6  1282 NA         NA              NA           NA               
 7  1286 NA         NA              NA           NA               
 8  1489 NA         NA              NA           NA               
 9  1572 NA         NA              NA           NA               
10  1869 NA         NA              NA           NA               
# … with 71 more rows
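
Cases without any date can be found by keeping the rows in which all four date columns are missing. The following base R sketch assumes a hypothetical `dates` data frame with the same columns as the output above. Ids 937 and 941 come from that output. Ids 5000 and 5001 and all date values are invented.

```r
# Hypothetical table: ids 937 and 941 have no date at all (as above);
# ids 5000 and 5001 and their dates are invented for illustration.
dates <- data.frame(
  id                = c(937, 941, 5000, 5001),
  onset             = as.Date(c(NA, NA, "2013-07-01", NA)),
  hospitalization   = as.Date(c(NA, NA, NA, "2016-08-15")),
  consultation      = as.Date(c(NA, NA, "2013-07-03", NA)),
  sample_collection = as.Date(c(NA, NA, "2013-07-03", NA))
)

date_cols <- c("onset", "hospitalization", "consultation", "sample_collection")

# A case has no date when none of its date columns is filled in.
no_date <- dates[rowSums(!is.na(dates[date_cols])) == 0, ]

print(no_date$id)
```

A case with at least one date (such as the hypothetical id 5001, which only has a hospitalization date) is not selected.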

The CSV file of these cases is here. Go here if you would rather copy and paste the contents of this CSV file to your computer. The breakdown of the 1841 coordinates by source is:

# A tibble: 4 x 2
  source       n
  <chr>    <int>
1 new_gps    821
2 old_gps    736
3 server      31
4 whatsapp   253

The breakdown of cases by test result and availability of GPS coordinates is:

       test
gps     negative not tested positive
  FALSE     1491       2800     2158
  TRUE         7        232     1600
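
The share of geocoded cases within each test category can be derived from this cross-tabulation. The base R sketch below re-enters the counts from the table above by hand. The object names `counts` and `gps_share` are illustrative only.

```r
# Counts copied from the cross-tabulation above (rows: gps, columns: test).
counts <- matrix(
  c(1491, 2800, 2158,
       7,  232, 1600),
  nrow = 2, byrow = TRUE,
  dimnames = list(gps  = c("FALSE", "TRUE"),
                  test = c("negative", "not tested", "positive"))
)

# Column-wise proportions: within each test category, the percentage of
# cases that have GPS coordinates.
gps_share <- 100 * prop.table(counts, margin = 2)["TRUE", ]

print(round(gps_share, 1))
```

The counts sum to the 8288 reported cases, and the `TRUE` row sums to the 1839 geocoded cases. Roughly 43 % of positive cases are geocoded, against less than 1 % of negative cases.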

The 7 cases with GPS data and a negative test are:

[1] 2218 2303 2375 2423 4925 5052 7097

Here is a map of the geolocated cases:

The blue crosses mark the Vayakorn Inn and the Institut Pasteur du Laos. The same map with a satellite image background: